AITopics | Borsod-Abaúj-Zemplén County

Rate of Model Collapse in Recursive Training

Suresh, Ananda Theertha, Thangaraj, Andrew, Khandavally, Aditya Nanda Kishore

arXiv.org Machine Learning, Dec-23-2024

Given the ease of creating synthetic data from machine learning models, new models can potentially be trained on synthetic data generated by previous models. This recursive training process raises concerns about the long-term impact on model quality. As models are recursively trained on generated data from previous rounds, their ability to capture the nuances of the original human-generated data may degrade. This is often referred to as model collapse. In this work, we ask how fast model collapse occurs for some well-studied distribution families under maximum likelihood (ML or near ML) estimation during recursive training. Surprisingly, even for fundamental distributions such as discrete and Gaussian distributions, the exact rate of model collapse is unknown. We theoretically characterize the rate of collapse in these fundamental settings and complement it with experimental evaluations. Our results show that for discrete distributions, the time to forget a word is approximately linearly dependent on the number of times it occurred in the original corpus, and for Gaussian models, the standard deviation reduces to zero roughly at $n$ iterations, where $n$ is the number of samples at each iteration. Both of these findings imply that model forgetting, at least in these simple distributions under near ML estimation with many samples, takes a long time.
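The Gaussian setting described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' experimental code: it assumes a standard normal starting distribution, and the function name `recursive_gaussian_fit`, the seed, and the values of `n` and `rounds` are all chosen here for illustration. Each round draws `n` samples from the current fit and refits the mean and standard deviation by maximum likelihood.

```python
import random
import statistics

def recursive_gaussian_fit(n, rounds, seed=0):
    """Repeatedly refit a Gaussian by ML on samples drawn from the previous fit."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # assumed round-0 distribution: standard normal
    history = [sigma]
    for _ in range(rounds):
        samples = [rng.gauss(mu, sigma) for _ in range(n)]
        mu = statistics.fmean(samples)
        # The ML estimate of the standard deviation divides by n (population form).
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history

history = recursive_gaussian_fit(n=20, rounds=400)
print(f"sigma at round 0:   {history[0]:.3g}")
print(f"sigma at round 20:  {history[20]:.3g}")
print(f"sigma at round 400: {history[400]:.3g}")
```

The printed values only show that the fitted standard deviation drifts toward zero over many rounds in this one seeded run; the sketch does not attempt to verify the rate stated in the abstract.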

artificial intelligence, machine learning, recursive training process, (16 more...)

arXiv: 2412.17646

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Neural Networks for Vehicle Routing Problem

Kovács, László, Jlidi, Ali

arXiv.org Artificial Intelligence, Sep-17-2024

The Vehicle Routing Problem is about optimizing the routes of vehicles to meet the needs of customers at specific locations. The route graph consists of depots on several levels and customer positions. Several optimization methods have been developed over the years, most of which are based on some type of classic heuristic: genetic algorithm, simulated annealing, tabu search, ant colony optimization, firefly algorithm. Recent developments in machine learning provide a new toolset, the rich family of neural networks, for tackling complex problems. The main area of application of neural networks is the area of classification and regression. Route optimization can be viewed as a new challenge for neural networks. The article first presents an analysis of the applicability of neural network tools, then a novel graphical neural network model is presented in detail. The efficiency analysis based on test experiments shows the applicability of the proposed NN architecture.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv: 2409.1129

Country: Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.05)

Genre: Research Report (0.82)

Industry: Transportation > Freight & Logistics Services (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Navigating Process Mining: A Case study using pm4py

Jlidi, Ali, Kovács, László

arXiv.org Artificial Intelligence, Sep-17-2024

Process-mining techniques have emerged as powerful tools for analyzing event data to gain insights into business processes. In this paper, we present a comprehensive analysis of road traffic fine management processes using the pm4py library in Python. We start by importing an event log dataset and explore its characteristics, including the distribution of activities and process variants. Through filtering and statistical analysis, we uncover key patterns and variations in the process executions. Subsequently, we apply various process-mining algorithms, including the Alpha Miner, Inductive Miner, and Heuristic Miner, to discover process models from the event log data. We visualize the discovered models to understand the workflow structures and dependencies within the process. Additionally, we discuss the strengths and limitations of each mining approach in capturing the underlying process dynamics. Our findings shed light on the efficiency and effectiveness of road traffic fine management processes, providing valuable insights for process optimization and decision-making. This study demonstrates the utility of pm4py in facilitating process mining tasks and its potential for analyzing real-world business processes.
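The exploration step mentioned in this abstract (activity distribution, process variants) can be sketched without any library. The example below is an assumption-laden illustration in plain Python, not the pm4py API and not the paper's code: the toy `EVENTS` log, its activity names, and the helper names `traces` and `summarize` are hypothetical. It also counts directly-follows pairs, the relation that discovery algorithms such as the Alpha Miner start from.

```python
from collections import Counter, defaultdict

# Hypothetical toy event log: (case id, activity, position in time).
EVENTS = [
    ("c1", "Create Fine", 1), ("c1", "Send Fine", 2), ("c1", "Payment", 3),
    ("c2", "Create Fine", 1), ("c2", "Payment", 2),
    ("c3", "Create Fine", 1), ("c3", "Send Fine", 2), ("c3", "Payment", 3),
]

def traces(events):
    """Group events by case and order each case's activities by time."""
    by_case = defaultdict(list)
    for case, activity, _ in sorted(events, key=lambda e: (e[0], e[2])):
        by_case[case].append(activity)
    return dict(by_case)

def summarize(events):
    """Return activity counts, variant counts, and directly-follows counts."""
    case_traces = traces(events)
    activity_counts = Counter(activity for _, activity, _ in events)
    variants = Counter(tuple(trace) for trace in case_traces.values())
    directly_follows = Counter()
    for trace in case_traces.values():
        directly_follows.update(zip(trace, trace[1:]))
    return activity_counts, variants, directly_follows

activity_counts, variants, directly_follows = summarize(EVENTS)
print(activity_counts.most_common())
print(variants.most_common())
print(directly_follows.most_common())
```

In a real analysis the log would be loaded from a file and handed to a process-mining library; the counting logic here only shows what the summary statistics mean.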

algorithm, event log, process model, (12 more...)

arXiv: 2409.11294

Country:

Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(4 more...)

Genre:

Workflow (0.67)
Research Report > New Finding (0.34)

Industry:

Materials > Metals & Mining (0.67)
Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

SPOT: Text Source Prediction from Originality Score Thresholding

Yvinec, Edouard, Kasser, Gabriel

arXiv.org Artificial Intelligence, May-30-2024

The wide acceptance of large language models (LLMs) has unlocked new applications and social risks. Popular countermeasures, which aim at detecting misinformation, usually involve domain-specific models trained to recognize the relevance of any information. Instead of evaluating the validity of the information, we propose to investigate LLM-generated text from the perspective of trust. In this study, we define trust as the ability to know whether an input text was generated by an LLM or a human. To do so, we design SPOT, an efficient method that classifies the source of any standalone text input based on an originality score. This score is derived from the prediction of a given LLM to detect other LLMs. We empirically demonstrate the robustness of the method to the architecture, training data, evaluation data, task and compression of modern LLMs.

evaluation, llm, opt 6, (15 more...)

arXiv: 2405.20505

Country:

Asia > Thailand > Nong Khai > Nong Khai (0.05)
Europe > Hungary > Borsod-Abaúj-Zemplén County (0.04)
Asia > Cambodia (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Education (1.00)
Leisure & Entertainment > Sports > Soccer (0.68)
Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Large Language Model (LLM) AI text generation detection based on transformer deep learning algorithm

Mo, Yuhong, Qin, Hao, Dong, Yushan, Zhu, Ziyi, Li, Zhenglin

arXiv.org Artificial Intelligence, Apr-6-2024

In this paper, a tool for detecting LLM-generated text is developed based on the Transformer model, aiming to improve the accuracy of AI text generation detection and provide a reference for subsequent research. First, the text is Unicode-normalised and converted to lowercase; characters other than alphabetic characters and punctuation marks are removed with regular expressions; spaces are added around punctuation marks; leading and trailing spaces are removed; consecutive ellipses are replaced with single spaces; and the text is joined using the specified delimiter. Next, non-alphabetic characters and extra whitespace are removed, runs of consecutive whitespace are replaced with a single space, and the text is again converted to lowercase. The deep learning model combines layers such as LSTM, Transformer and CNN for text classification or sequence labelling tasks. On the training and validation sets, the model loss decreases from 0.127 to 0.005 and accuracy increases from 94.96% to 99.8%, indicating that the model has good detection and classification ability for AI-generated text. The test set confusion matrix and accuracy show that the model has 99% prediction accuracy for AI-generated text, with a precision of 0.99, a recall of 1, and an F1 score of 0.99. Looking forward, the approach has prospects for wide application in the field of AI text detection.
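The first cleaning pass described in this abstract can be sketched as a short pipeline. This is one plausible reading, not the authors' code: the normalisation form (NFKC), the punctuation set, and the names `preprocess` and `f1_score` are assumptions made here. The second helper only checks that the reported precision and recall are consistent with the reported F1 score.

```python
import re
import unicodedata

PUNCT = r"[.,!?;:]"  # assumed punctuation set; the abstract does not list one

def preprocess(text, delimiter=" "):
    """Normalise, lowercase, strip unwanted characters, and pad punctuation."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"\.{2,}", " ", text)           # runs of dots (ellipses) -> one space
    text = re.sub(r"[^a-z.,!?;:\s]", "", text)    # keep letters, punctuation, whitespace
    text = re.sub(rf"({PUNCT})", r" \1 ", text)   # add spaces around punctuation
    return delimiter.join(text.split())           # trim ends, collapse whitespace

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

print(preprocess("Hello,   World!!  It's 2024...ok"))
print(round(f1_score(0.99, 1.0), 2))
```

With precision 0.99 and recall 1, the harmonic mean is 1.98 / 1.99, which rounds to the 0.99 reported in the abstract.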

accuracy, ai text generation detection, ai-generated text, (12 more...)

arXiv: 2405.06652

Country:

Oceania > Fiji (0.04)
North America > United States > Texas (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(5 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Graphs Unveiled: Graph Neural Networks and Graph Generation

Kovács, László, Jlidi, Ali

arXiv.org Artificial Intelligence, Mar-18-2024

Embarking on the exploration of machine learning applied to graphs [1] invites us into a realm where graphs, representing connections between objects (nodes), become a universal language for deciphering complex systems [2]. For instance, in a social network graph, individuals are nodes, and friendships are edges. The power of this concept becomes evident in historical studies, like Wayne W. Zachary's analysis of a karate club's dynamics [3], predicting factional splits based on the graph structure. What makes graphs versatile is their ability to represent various interactions, be it in social networks, biology, or even telecommunications. Now, as we step into the world of machine learning, graphs become more than visual representations.

graph, graph neural network, node, (12 more...)

doi: 10.32968/psaie.2023.1.5

arXiv: 2403.13849

Country: Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.05)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Link between Coding Theory and Cross-Validation with Applications

Pahikkala, Tapio, Movahedi, Parisa, Montoya, Ileana, Miikonen, Havu, Foldes, Stephan, Airola, Antti, Major, Laszlo

arXiv.org Artificial Intelligence, Feb-9-2024

How many different binary classification problems can a single learning algorithm solve on a fixed data set with exactly zero, or at most a given number of, cross-validation errors? While the number in the former case is known to be limited by the no-free-lunch theorem, we show that the exact answers are given by the theory of error detecting codes. As a case study, we focus on the AUC performance measure and leave-pair-out cross-validation (LPOCV), in which every possible pair of data with different class labels is held out at a time. We show that the maximal number of classification problems with fixed class proportion, for which a learning algorithm can achieve zero LPOCV error, equals the maximal number of code words in a constant weight code (CWC), with certain technical properties. We then generalize CWCs by introducing light CWCs, and prove an analogous result for nonzero LPOCV errors and light CWCs. Moreover, we prove both upper and lower bounds on the maximal numbers of code words in light CWCs. Finally, as an immediate practical application, we develop new LPOCV based randomization tests for learning algorithms that generalize the classical Wilcoxon-Mann-Whitney U test.
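The leave-pair-out procedure named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method for counting solvable problems: the one-dimensional toy learner `fit_mean_difference`, the function name `lpocv_auc`, and the sample data are hypothetical. Every positive/negative pair is held out in turn, the learner is refit on the remaining points, and the pair counts as correct when the positive example receives the higher score (ties count one half).

```python
from itertools import product
from statistics import fmean

def fit_mean_difference(xs, ys):
    """Hypothetical toy learner: score(x) = (mean of positives - mean of negatives) * x."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    weight = fmean(pos) - fmean(neg)
    return lambda x: weight * x

def lpocv_auc(xs, ys, fit=fit_mean_difference):
    """AUC estimate from leave-pair-out cross-validation.

    Assumes at least two examples of each class, so every training
    split still contains both classes.
    """
    pos_idx = [i for i, y in enumerate(ys) if y == 1]
    neg_idx = [i for i, y in enumerate(ys) if y == 0]
    total = 0.0
    for i, j in product(pos_idx, neg_idx):
        train = [k for k in range(len(xs)) if k not in (i, j)]
        score = fit([xs[k] for k in train], [ys[k] for k in train])
        s_pos, s_neg = score(xs[i]), score(xs[j])
        total += 1.0 if s_pos > s_neg else 0.5 if s_pos == s_neg else 0.0
    return total / (len(pos_idx) * len(neg_idx))

separable_auc = lpocv_auc([0, 1, 2, 10, 11, 12], [0, 0, 0, 1, 1, 1])
mixed_auc = lpocv_auc([0.1, 0.4, 0.35, 0.8, 0.9, 0.7], [0, 0, 1, 1, 1, 0])
print(separable_auc, mixed_auc)
```

An LPOCV error of zero corresponds to an estimate of exactly 1.0, which is what the well-separated toy data produces here.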

algorithm, hypothesis, orientation, (16 more...)

arXiv: 2103.11856

Country:

Europe > Finland > Southwest Finland > Turku (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.93)
Health & Medicine > Diagnostic Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.81)

Presence of informal language, such as emoticons, hashtags, and slang, impact the performance of sentiment analysis models on social media text?

Ganie, Aadil Gani

arXiv.org Artificial Intelligence, Jan-28-2023

This study aimed to investigate the influence of informal language, such as emoticons and slang, on the performance of sentiment analysis models applied to social media text. A convolutional neural network (CNN) model was developed and trained on three datasets: a sarcasm dataset, a sentiment dataset, and an emoticon dataset. The model architecture was held constant for all experiments, and the model was trained on 80% of the data and tested on 20%. The results revealed that the model achieved an accuracy of 96.47% on the sarcasm dataset, with the lowest accuracy for class 1. On the sentiment dataset, the model achieved an accuracy of 95.28%. The amalgamation of sarcasm and sentiment datasets improved the accuracy of the model to 95.1%, and the addition of the emoticon dataset had a slight positive impact, bringing the accuracy to 95.37%. The study suggests that the presence of informal language has a restricted impact on the performance of sentiment analysis models applied to social media text. However, including emoticon data in the model can enhance the accuracy slightly.

machine learning, natural language, sentiment analysis model, (15 more...)

arXiv: 2301.12303

Country: Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.05)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.51)
Information Technology > Services (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Intransitively winning chess players positions

Poddiakov, Alexander

arXiv.org Artificial Intelligence, Dec-11-2022

Positions of chess players in intransitive (rock-paper-scissors) relations are considered. Namely, position A of White is preferable (it should be chosen if choice is possible) to position B of Black, position B of Black is preferable to position C of White, position C of White is preferable to position D of Black, but position D of Black is preferable to position A of White. Intransitivity of winningness of positions of chess players is considered to be a consequence of complexity of the chess environment -- in contrast with simpler games with transitive positions only. The space of relations between winningness of positions of chess players is non-Euclidean. The Zermelo-von Neumann theorem is complemented by statements about possibility vs. impossibility of building pure winning strategies based on the assumption of transitivity of positions of chess players. Questions about the possibility of intransitive positions of players in other positional games are raised.
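The four-position cycle described in this abstract can be written down as a preference relation and checked mechanically. This is a small illustrative sketch, not taken from the paper: the labels A to D stand for the abstract's positions, and the helper names `is_transitive` and `find_cycle` are chosen here.

```python
# (x, y) means "position x is preferable to position y", as in the abstract.
PREFERRED = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")}

def is_transitive(relation):
    """True if x > y and y > z always implies x > z within the relation."""
    return all((a, d) in relation
               for (a, b) in relation
               for (c, d) in relation
               if b == c)

def find_cycle(relation, start):
    """Return a preference cycle through `start` as a list of positions, or None."""
    successors = {}
    for a, b in relation:
        successors.setdefault(a, []).append(b)
    path = [start]

    def dfs(node):
        for nxt in sorted(successors.get(node, [])):
            if nxt == start:
                return path + [start]
            if nxt not in path:
                path.append(nxt)
                found = dfs(nxt)
                if found:
                    return found
                path.pop()
        return None

    return dfs(start)

print(is_transitive(PREFERRED))
print(find_cycle(PREFERRED, "A"))
```

A relation that contains such a cycle cannot be ordered from best to worst, which is the sense in which the positions form a rock-paper-scissors pattern.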

artificial intelligence, game theory, magician, (17 more...)

arXiv: 2212.11069

Country:

Europe > Hungary > Veszprém County > Veszprém (0.04)
Europe > Hungary > Csongrád-Csanád County > Szeged (0.04)
Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Chess (1.00)

Technology:

Information Technology > Artificial Intelligence (0.48)
Information Technology > Game Theory (0.46)

Transfer Learning with Uncertainty Quantification: Random Effect Calibration of Source to Target (RECaST)

Hickey, Jimmy, Williams, Jonathan P., Hector, Emily C.

arXiv.org Machine Learning, Nov-29-2022

Transfer learning uses a data model, trained to make predictions or inferences on data from one population, to make reliable predictions or inferences on data from another population. Most existing transfer learning approaches are based on fine-tuning pre-trained neural network models, and fail to provide crucial uncertainty quantification. We develop a statistical framework for model predictions based on transfer learning, called RECaST. The primary mechanism is a Cauchy random effect that recalibrates a source model to a target population; we mathematically and empirically demonstrate the validity of our RECaST approach for transfer learning between linear models, in the sense that prediction sets will achieve their nominal stated coverage, and we numerically illustrate the method's robustness to asymptotic approximations for nonlinear models. Whereas many existing techniques are built on particular source models, RECaST is agnostic to the choice of source model. For example, our RECaST transfer learning approach can be applied to a continuous or discrete data model with linear or logistic regression, deep neural network architectures, etc. Furthermore, RECaST provides uncertainty quantification for predictions, which is mostly absent in the literature. We examine our method's performance in a simulation study and in an application to real hospital data.

artificial intelligence, machine learning, target data, (17 more...)

arXiv: 2211.16557

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > North Carolina (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.88)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
